60 research outputs found
Neural End-to-End Learning for Computational Argumentation Mining
We investigate neural techniques for end-to-end computational argumentation
mining (AM). We frame AM both as a token-based dependency parsing and as a
token-based sequence tagging problem, including a multi-task learning setup.
Contrary to models that operate on the argument component level, we find that
framing AM as dependency parsing leads to subpar performance. In
contrast, less complex (local) tagging models based on BiLSTMs perform robustly
across classification scenarios, being able to catch long-range dependencies
inherent to the AM problem. Moreover, we find that jointly learning 'natural'
subtasks, in a multi-task learning setup, improves performance.
Comment: To be published at ACL 2017
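The token-based sequence tagging framing can be illustrated with a minimal, hypothetical sketch (not the paper's code): each token carries a BIO tag such as B-Claim or I-Premise, and argument components are recovered by decoding maximal B-I runs into spans.

```python
def bio_spans(tags):
    """Decode a BIO tag sequence into (label, start, end) component spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:           # close the previous component
                spans.append((label, start, i))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is not None and label == tag[2:]:
            continue                        # the current component continues
        else:                               # "O" or an inconsistent I- tag
            if start is not None:
                spans.append((label, start, i))
            start, label = None, None
    if start is not None:
        spans.append((label, start, len(tags)))
    return spans

tags = ["O", "B-Claim", "I-Claim", "O", "B-Premise", "I-Premise", "I-Premise"]
print(bio_spans(tags))  # [('Claim', 1, 3), ('Premise', 4, 7)]
```

A model such as a BiLSTM tagger would predict the tag sequence; this decoding step is what turns local token decisions into the component-level structure the task ultimately asks for.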
Aspect-Controlled Neural Argument Generation
We rely on arguments in our daily lives to deliver our opinions and base them
on evidence, making them more convincing in turn. However, finding and
formulating arguments can be challenging. In this work, we train a language
model for argument generation that can be controlled on a fine-grained level to
generate sentence-level arguments for a given topic, stance, and aspect. We
define argument aspect detection as a necessary method to allow this
fine-grained control and crowdsource a dataset with 5,032 arguments annotated
with aspects. Our evaluation shows that our generation model is able to
generate high-quality, aspect-specific arguments. Moreover, these arguments can
be used to improve the performance of stance detection models via data
augmentation and to generate counter-arguments. We publish all datasets and
code to fine-tune the language model.
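The fine-grained control idea can be sketched as follows; the exact control-code format here is an assumption for illustration, not the paper's: topic, stance, and aspect are serialized as control tokens that prefix the prompt of the fine-tuned language model.

```python
def control_prompt(topic, stance, aspect, prefix=""):
    """Serialize control codes (hypothetical format) ahead of the text
    the language model should continue with a generated argument."""
    return f"[TOPIC={topic}] [STANCE={stance}] [ASPECT={aspect}] {prefix}".rstrip()

print(control_prompt("nuclear energy", "con", "waste disposal"))
# [TOPIC=nuclear energy] [STANCE=con] [ASPECT=waste disposal]
```

During fine-tuning, each training argument would be paired with its annotated topic, stance, and aspect in this serialized form, so that at generation time varying the aspect token steers the content of the output.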
Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!
Argumentation mining (AM) requires the identification of complex discourse
structures and has lately been applied with success monolingually. In this
work, we show that the existing resources are, however, not adequate for
assessing cross-lingual AM, due to their heterogeneity or lack of complexity.
We therefore create suitable parallel corpora by (human and machine)
translating a popular AM dataset consisting of persuasive student essays into
German, French, Spanish, and Chinese. We then compare (i) annotation projection
and (ii) bilingual word embeddings based direct transfer strategies for
cross-lingual AM, finding that the former performs considerably better and
almost eliminates the loss from cross-lingual transfer. Moreover, we find that
annotation projection works equally well when using either costly human or
cheap machine translations. Our code and data are available at
http://github.com/UKPLab/coling2018-xling_argument_mining.
Comment: Accepted at COLING 2018
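Annotation projection can be sketched with a toy example (not the released code): source-side token labels are copied to target tokens through the word alignments obtained from the (human or machine) translation, and unaligned target tokens fall back to O.

```python
def project_labels(src_tags, alignment, tgt_len):
    """Project token labels from source to target via word alignments.

    alignment: list of (src_idx, tgt_idx) pairs; unaligned target
    tokens keep the default 'O' label.
    """
    tgt_tags = ["O"] * tgt_len
    for s, t in alignment:
        tgt_tags[t] = src_tags[s]
    return tgt_tags

src = ["B-Claim", "I-Claim", "O"]
align = [(0, 1), (1, 2), (2, 0)]   # word order changes across languages
print(project_labels(src, align, 3))  # ['O', 'B-Claim', 'I-Claim']
```

In practice the projected BIO sequence may still need repair (e.g. an I- tag without a preceding B- after reordering), which is part of what makes projection quality depend on alignment quality.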
How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation
Sentence encoders map sentences to real-valued vectors for use in downstream
applications. To peek into these representations - e.g., to increase
interpretability of their results - probing tasks have been designed which
query them for linguistic knowledge. However, designing probing tasks for
lesser-resourced languages is tricky, because these often lack large-scale
annotated data or (high-quality) dependency parsers as a prerequisite of
probing task design in English. To investigate how to probe sentence embeddings
in such cases, we examine the sensitivity of probing task results to structural
design choices, conducting the first large-scale study of its kind. We show that
design choices like size of the annotated probing dataset and type of
classifier used for evaluation do (sometimes substantially) influence probing
outcomes. We then probe embeddings in a multilingual setup with design choices
that lie in a 'stable region', as we identify for English, and find that
results on English do not transfer to other languages. Fairer and more
comprehensive sentence-level probing evaluation should thus be carried out on
multiple languages in the future.
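The probing setup can be sketched with a toy, stdlib-only example (the study's actual classifiers, properties, and embeddings differ): a small classifier, here a perceptron, is trained on frozen sentence vectors to predict a linguistic property, and its accuracy is read as evidence about what the vectors encode.

```python
def train_perceptron(vectors, labels, epochs=20, lr=0.1):
    """Fit a linear probe on frozen vectors; the encoder is never updated."""
    w, b = [0.0] * len(vectors[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# toy stand-in "sentence embeddings" with a binary property label
X = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.7]]
y = [1, 1, 0, 0]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])  # fits the separable toy data
```

The paper's point is that choices hidden in this sketch, such as how much probing data exists and which classifier family plays the probe, can themselves shift the measured accuracy.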
Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
There are two approaches for pairwise sentence scoring: Cross-encoders, which
perform full-attention over the input pair, and Bi-encoders, which map each
input independently to a dense vector space. While cross-encoders often achieve
higher performance, they are too slow for many practical use cases.
Bi-encoders, on the other hand, require substantial training data and
fine-tuning over the target task to achieve competitive performance. We present
a simple yet efficient data augmentation strategy called Augmented SBERT, where
we use the cross-encoder to label a larger set of input pairs to augment the
training data for the bi-encoder. We show that, in this process, selecting the
sentence pairs is non-trivial and crucial for the success of the method. We
evaluate our approach on multiple tasks (in-domain) as well as on a domain
adaptation task. Augmented SBERT achieves an improvement of up to 6 points for
in-domain and of up to 37 points for domain adaptation tasks compared to the
original bi-encoder performance.
Comment: Accepted at NAACL 2021
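The augmentation loop can be sketched as follows; the word-overlap scorer below stands in for a real cross-encoder and is purely an assumption. Unlabeled sentence pairs are scored by the slow cross-encoder, and the resulting silver-labeled pairs extend the training data for the fast bi-encoder.

```python
import itertools

def overlap_score(a, b):
    """Stand-in for a real cross-encoder: Jaccard word overlap (assumption)."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

def augment_pairs(sentences, gold_pairs, cross_encoder_score):
    """Label candidate pairs with the cross-encoder to create silver
    training data for the bi-encoder."""
    silver = []
    for a, b in itertools.combinations(sentences, 2):
        if (a, b) in gold_pairs:
            continue  # keep the original gold label instead
        silver.append(((a, b), cross_encoder_score(a, b)))
    return silver

sents = ["the cat sat", "the cat ran", "dogs bark loudly"]
for pair, score in augment_pairs(sents, set(), overlap_score):
    print(pair, round(score, 2))
```

The paper's key finding, that selecting which pairs to score is non-trivial and crucial, would replace the exhaustive enumeration via itertools.combinations with a targeted sampling strategy.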
Multi-Task Learning for Argumentation Mining in Low-Resource Settings
We investigate whether and where multi-task learning (MTL) can improve
performance on NLP problems related to argumentation mining (AM), in particular
argument component identification. Our results show that MTL performs
particularly well (and better than single-task learning) when little training
data is available for the main task, a common scenario in AM. Our findings
challenge previous assumptions that conceptualizations across AM datasets are
divergent and that MTL is difficult for semantic or higher-level tasks.
Comment: Accepted at NAACL 2018
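Hard parameter sharing, the usual MTL setup, can be reduced to a minimal sketch (the task names below are hypothetical): per-task losses are combined into one objective, so the shared encoder receives gradients from every task, which is where the transfer to a data-poor main task comes from.

```python
def multitask_loss(task_losses, weights=None):
    """Weighted sum of per-task losses; every task's gradient flows
    into the shared encoder under hard parameter sharing."""
    if weights is None:
        weights = {task: 1.0 for task in task_losses}
    return sum(weights[task] * loss for task, loss in task_losses.items())

# hypothetical main task plus one down-weighted auxiliary task
losses = {"component_identification": 0.8, "aux_clause_tagging": 0.4}
print(round(multitask_loss(losses, {"component_identification": 1.0,
                                    "aux_clause_tagging": 0.5}), 2))  # 1.0
```

Down-weighting auxiliary tasks is one common way to keep them from dominating the main task when the main task has little training data.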
The Writing Process in Online Mass Collaboration: NLP-Supported Approaches to Analyzing Collaborative Revision and User Interaction
In the past 15 years, the rapid development of web technologies has created novel ways of collaborative editing. Open online platforms have attracted millions of users from all over the world. The open encyclopedia Wikipedia, started in 2001, has become a very prominent example of a largely successful platform for collaborative editing and knowledge creation. The wiki model has enabled collaboration at a new scale, with more than 30,000 monthly active users on the English Wikipedia.
Traditional writing research deals with questions concerning revision and the writing process itself. The analysis of collaborative writing additionally raises questions about the interaction of the involved authors. Interaction takes place when authors write on the same document (indirect interaction), or when they coordinate the collaborative writing process by means of communication (direct interaction). The study of collaborative writing in online mass collaboration poses several interesting challenges. First and foremost, the writing process in open online collaboration is typically characterized by a large number of revisions from many different authors. Therefore, it is important to understand the interplay and the sequences of different revision categories. As the quality of documents produced in a collaborative writing process varies greatly, the relationship between collaborative revision and document quality is an important field of study. Furthermore, the impact of direct user interaction through background discussions on the collaborative writing process is largely unknown. In this thesis, we tackle these challenges in the context of online mass collaboration, using one of the largest collaboratively created resources, Wikipedia, as our data source. We will also discuss to what extent our conclusions are valid beyond Wikipedia.
We will be dealing with three aspects of collaborative writing in Wikipedia. First, we carry out a content-oriented analysis of revisions in the Wikipedia revision history. This includes the segmentation of article revisions into human-interpretable edits. We develop a taxonomy of edit categories such as spelling error corrections, vandalism or information adding, and verify our taxonomy in an annotation study on a corpus of edits from the English and German Wikipedia. We use the annotated corpora as training data to create models which enable the automatic classification of edits. To show that our model is able to generalize beyond our own data, we train and test it on a second corpus of English Wikipedia revisions. We analyze the distribution of edit categories and frequent patterns in edit sequences within a larger set of article revisions. We also assess the relationship between edit categories and article quality, finding that the information content in high-quality articles tends to become more stable after their promotion and that high-quality articles show a higher degree of homogeneity with respect to frequent collaboration patterns as compared to random articles.
Second, we investigate activity-based roles of users in Wikipedia and how they relate to the collaborative writing process. We automatically classify all revisions in a representative sample of Wikipedia articles and cluster users in this sample into seven intuitive roles. The roles are based on the editing behavior of the users. We find roles such as Vandals, Watchdogs, or All-round Contributors. We also analyze the stability of our discovered roles across time and analyze role transitions. The results show that although the nature of roles remains stable across time, more than half of the users in our sample changed their role between two time periods.
Third, we analyze the correspondence between indirect user interaction through collaborative editing and direct user interaction through background discussion. We analyze direct user interaction using the notion of turns, which has been established in previous work. Turns are snippets from Wikipedia discussion pages. We introduce the notion of corresponding edit-turn-pairs. A corresponding edit-turn-pair consists of a turn and an edit from the same Wikipedia article; the turn forms an explicit performative and the edit corresponds to this performative. This happens, for example, when a user complains about a missing reference in the discussion about an article, and another user adds an appropriate reference to the article itself. We identify the distinctive properties of corresponding edit-turn-pairs and use them to create a model for the automatic detection of corresponding and non-corresponding edit-turn-pairs. We show that the percentage of corresponding edit-turn-pairs in a corpus of flawed English Wikipedia articles is typically below 5% and varies considerably across different articles.
The thesis concludes with a summary of our main contributions and findings. The growing number of collaborative platforms in commercial applications and education, e.g. in massive open online courses, demonstrates the need to understand the collaborative writing process and to support collaborating authors. We also discuss several open issues with respect to the questions addressed in the main parts of the thesis and point out possible directions for future work. Many of the experiments we carried out in the course of this thesis rely on supervised text classification. In the appendix, we explain the concepts and technologies underlying these experiments. We also introduce the DKPro TC framework, which was substantially extended as part of this thesis.
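The supervised edit classification described above can be sketched with a toy feature extractor (the thesis's real feature set is far richer): a character-level diff between two revisions yields simple features, such as inserted and deleted character counts, that a classifier could consume.

```python
import difflib

def edit_features(old, new):
    """Toy features for classifying one edit between two revisions."""
    ops = difflib.SequenceMatcher(a=old, b=new).get_opcodes()
    inserted = sum(j2 - j1 for op, i1, i2, j1, j2 in ops
                   if op in ("insert", "replace"))
    deleted = sum(i2 - i1 for op, i1, i2, j1, j2 in ops
                  if op in ("delete", "replace"))
    return {"inserted_chars": inserted, "deleted_chars": deleted}

old = "The cat sat on the mat."
new = "The cat sat on the mat. It purred."
print(edit_features(old, new))  # {'inserted_chars': 11, 'deleted_chars': 0}
```

Features like these would separate, say, a pure deletion (possible vandalism) from balanced insert/delete counts (a rewording) or a large insertion (information adding); the actual models in the thesis are trained on annotated edits.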
ArgumenText: Decision Support through Automatic Extraction of Arguments from Large Text Sources (Final Report)
Taking all relevant reasons for a decision into account is increasingly difficult given the prevailing flood of information, e.g. on the internet. Existing technologies such as search engines do support the decision process, but the relevant arguments contained in the documents are lost along the way. The ArgumenText validation project set out to support decision-making and knowledge-generation processes effectively and efficiently by automatically extracting arguments from large text sources. It was able to build on argument extraction methods from preceding basic research, which identify claims and premises in individual text documents. Within the VIP+ validation funding scheme, the focus was in particular on developing a promising exploitation strategy. Accordingly, the ArgumenText work program and project management were geared towards defining use cases in which the technology was adapted and evaluated. On the technical side, breakthroughs were achieved in particular in argument extraction from heterogeneous text sources (a flexible scheme was created that always defines pro and con arguments with respect to a given topic) and in argument extraction from massive data collections (a two-stage procedure was created that, in real time, first searches documents for relevance to the topic and then searches within those documents for matching arguments). In addition, a method for grouping arguments by argumentative aspects was developed. During validation, the journalism use case was dropped in favor of the purchase-decision use case due to a lack of exploitation prospects. Driven by the results of the evaluation studies, market analyses, and legal assessments, the latter was divided into the cases of innovation and technology assessment as well as customer feedback analysis.
For commercial exploitation, follow-up funding for founding a company was successfully acquired within the BMWi's EXIST program.
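The two-stage procedure described in the report can be sketched as a toy, stdlib-only pipeline; both the relevance ranking and the is_argument classifier below are stand-ins, not ArgumenText's models. Documents are first ranked by topic relevance, then the sentences of the top documents are searched for arguments.

```python
def two_stage_argument_search(topic, documents, is_argument, top_k=2):
    """Stage 1: rank documents by topic relevance (word overlap here).
    Stage 2: scan sentences of the top documents for arguments."""
    topic_words = set(topic.lower().split())

    def relevance(doc):
        return len(topic_words & set(doc.lower().split()))

    ranked = sorted(documents, key=relevance, reverse=True)
    return [sent
            for doc in ranked[:top_k]
            for sent in doc.split(". ")
            if is_argument(sent, topic)]

# stand-in argument classifier: flags sentences with a causal marker
def has_reason(sent, topic):
    return "because" in sent.lower()

docs = ["Nuclear power is risky because accidents happen. The sky is blue",
        "Cooking pasta is easy"]
print(two_stage_argument_search("nuclear power", docs, has_reason))
```

The point of the two stages is efficiency at scale: the cheap document-level filter keeps the expensive sentence-level argument classifier off the vast majority of the collection.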